A new cepstral prefiltering technique for estimating time delay under reverberant conditions
Authors
Abstract
A microphone array can be used for hands-free acquisition of speech under reverberant conditions. This requires knowledge of the desired talker's location, which can be obtained by estimating the time delays between the signals received by one or more pairs of spatially separated microphones. However, in a typical audio-conference room, strong reverberation is usually present and can have disastrous effects on the performance of conventional time delay estimation (TDE) methods. In this article, we present and evaluate a new cepstral prefiltering technique that can be applied to the received signals before the actual TDE in order to obtain a more accurate estimate of the delay in a typical reverberant environment. The technique is based on estimating the minimum-phase component (MPC) of the channel cepstrum and subtracting it from the total cepstrum of each microphone signal. Thus, just as certain TDE methods need to estimate the power spectral densities of the signals of interest from the received data, the new method requires the estimation of the channel MPC in the cepstral domain. The performance of a TDE system with and without cepstral prefiltering is compared via Monte Carlo simulations for fixed random and speech sources as well as for a moving random source. The results clearly demonstrate the beneficial effects of the new cepstral prefiltering technique on TDE performance when the source is fixed or slowly moving.
Résumé: A microphone array can be used for the hands-free reception of speech signals in a reverberant environment. This requires knowledge of the talker's position, which can be obtained by estimating the propagation delays between the signals received by several pairs of microphones. However, in a typical teleconference room, a high level of reverberation is usually present and can have disastrous effects on the performance of conventional time delay estimation (TDE) methods. In this article, we present and evaluate a new cepstral prefiltering technique that can be applied to the received signals before TDE so as to obtain more precise delay estimates in a reverberant environment. This technique is based on estimating the minimum-phase component (MPC) of the cepstrum of the transmission channel, which …
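The two ingredients named in the abstract, a minimum-phase cepstral component and a delay estimate between two microphone signals, can be illustrated with a minimal sketch. The abstract does not say how the authors estimate the channel MPC, so everything here is an assumption for illustration: the function names, the FFT length, the textbook trick of folding the real cepstrum to obtain a minimum-phase cepstrum, and plain cross-correlation as the TDE step.

```python
import numpy as np

def real_cepstrum(x, nfft):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x, nfft)
    log_mag = np.log(np.maximum(np.abs(spectrum), 1e-12))  # floor avoids log(0)
    return np.fft.ifft(log_mag).real

def minimum_phase_cepstrum(x, nfft):
    """Cepstrum of the minimum-phase component (assumes nfft is even).

    Standard homomorphic folding: keep quefrency 0, double the causal
    part, and zero the anticausal part of the real cepstrum.
    """
    fold = np.zeros(nfft)
    fold[0] = 1.0
    fold[1:nfft // 2] = 2.0
    fold[nfft // 2] = 1.0
    return real_cepstrum(x, nfft) * fold

def estimate_delay(x1, x2):
    """Plain cross-correlation TDE: lag of x2 relative to x1, in samples."""
    xcorr = np.correlate(x2, x1, mode="full")
    return int(np.argmax(xcorr)) - (len(x1) - 1)

# Hypothetical test signals: white noise and a copy delayed by 7 samples.
rng = np.random.default_rng(0)
source = rng.standard_normal(2048)
delayed = np.concatenate([np.zeros(7), source[:-7]])
print(estimate_delay(source, delayed))  # 7

# A two-tap minimum-phase channel h = [1, 0.5]: its cepstrum is already
# causal, so folding recovers it (c[1] = 0.5, c[2] = -0.125).
mpc = minimum_phase_cepstrum(np.array([1.0, 0.5]), 1024)
print(mpc[1], mpc[2])
```

In a prefiltering scheme of the kind the abstract describes, an estimate of the channel's minimum-phase cepstrum would be subtracted from each microphone signal's cepstrum before the delay estimate is computed. That estimation step is left out here because the abstract does not specify it.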
Related resources
Cepstral prefiltering for time delay estimation in reverberant environments
Time delay estimation (TDE) between the signals received by two or more spatially separated microphones can be used as a means for the passive localization of the dominant talker in applications such as audio-conference. However, in a recent study, it has been shown that reverberation can have disastrous effects on TDE performance. In this paper, we develop and evaluate a new cepstral prefilteri...
Robust feature extraction based on an asymmetric level-dependent auditory filterbank and a subband spectrum enhancement technique
In this paper we introduce a robust feature extractor, dubbed robust compressive gammachirp filterbank cepstral coefficients (RCGCC), based on an asymmetric and level-dependent compressive gammachirp filterbank and a sigmoid-shaped weighting rule for the enhancement of speech spectra in the auditory domain. The goal of this work is to improve the robustness of speech recognition systems in ad...
A Wavelet-Based GCC Prefiltering Algorithm for Speech DOA Estimation
Generalized cross-correlation with the phase transform prefilter is widely adopted for estimating time differences of arrival, since it performs very well in reverberant environments under low-noise conditions. However, it is not robust in high-noise environments. In this paper, we propose to prefilter the signals in the wavelet domain with the aim of estimating the direction-of-arri...
The Use of Locally Normalized Cepstral Coefficients (LNCC) to Improve Speaker Recognition Accuracy in Highly Reverberant Rooms
We describe the ability of LNCC features (Locally Normalized Cepstral Coefficients) to improve speaker recognition accuracy in highly reverberant environments. We used a realistic test environment, in which we changed the number and nature of reflective surfaces in the room, creating four increasingly long reverberation times, from approximately 1 to 9 seconds. In this room, we re-recorded reverberated...
Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum
The goal of this work is to improve the robustness of speech recognition systems in additive noise and real-time reverberant environments. In this paper we present a compressive gammachirp filter-bank-based feature extractor that incorporates a method for the enhancement of the auditory spectrum and a short-time feature normalization technique, which, by adjusting the scale and mean of cepstral feat...
Journal title:
- Signal Processing
Volume 59, Issue
Pages -
Publication date: 1997